Computer Science Carnegie Mellon DISTRIBUTION
Authors
Abstract
When given several problems to solve in some domain, a standard reinforcement learner learns an optimal policy from scratch for each problem. If the domain has characteristics that are independent of the goal and the problem, the learner might be able to take advantage of previously solved problems. Unfortunately, it is generally infeasible to apply a learned policy directly to new problems. This paper presents a method that biases exploration using the solutions to previous problems, and we show that it speeds up learning on new problems. We first allow a Q-learner to learn the optimal policies for several problems. We describe each state in terms of local features, assuming that these state features, together with the learned policies, can be used to abstract the domain characteristics away from the specific layout of states and rewards in a particular problem. We then use a classifier to learn this abstraction from training examples extracted from each learned Q-table. The trained classifier maps state features to the potentially goal-independent successful actions in the domain. Given a new problem, we include the output of the classifier as an exploration bias to improve the rate of convergence of the reinforcement learner. We have validated our approach empirically and report results in the complex Sokoban domain, which we introduce.
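The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which classifier is used or how its output enters action selection, so the majority-vote lookup table, the `epsilon` and `bias` probabilities, the function names, and the toy `blocked_right` feature are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

def extract_examples(q_table, features):
    """Turn one learned Q-table into (local_features, greedy_action) pairs."""
    examples = []
    for state, action_values in q_table.items():
        best_action = max(action_values, key=action_values.get)
        examples.append((features(state), best_action))
    return examples

def train_classifier(examples):
    """Stand-in classifier (assumption): majority vote per feature tuple."""
    votes = defaultdict(Counter)
    for feats, action in examples:
        votes[feats][action] += 1
    return {feats: counts.most_common(1)[0][0] for feats, counts in votes.items()}

def choose_action(q_values, suggestion, rng, epsilon=0.2, bias=0.8):
    """Epsilon-greedy selection whose exploratory moves favour the suggestion."""
    actions = list(q_values)
    if rng.random() >= epsilon:
        return max(actions, key=q_values.get)  # exploit the current estimate
    if suggestion is not None and rng.random() < bias:
        return suggestion                      # biased exploration
    return rng.choice(actions)                 # unbiased exploration

# Toy "solved problems" (hypothetical data): states are (problem_id, cell),
# and the only local feature is whether the cell to the right is blocked.
blocked_right = {("p1", 0): False, ("p1", 1): True,
                 ("p2", 0): False, ("p2", 1): True}
q_tables = [
    {("p1", 0): {"right": 1.0, "up": 0.2}, ("p1", 1): {"right": -1.0, "up": 0.5}},
    {("p2", 0): {"right": 0.9, "up": 0.1}, ("p2", 1): {"right": -0.5, "up": 0.7}},
]

def features(state):
    return (blocked_right[state],)

examples = [ex for q in q_tables for ex in extract_examples(q, features)]
classifier = train_classifier(examples)

# On a new problem the Q-values start at zero, so the classifier's
# suggestion is what steers the early exploratory moves.
rng = random.Random(0)
new_q = {"right": 0.0, "up": 0.0}
action = choose_action(new_q, classifier.get((True,)), rng)
```

In this sketch the bias only changes which action is tried during exploration. The Q-learning update itself is untouched, so the learner can still override a poor suggestion as its own value estimates improve.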
Publication date: 1999